-
Notifications
You must be signed in to change notification settings - Fork 585
feat: add Node class for serialization and implement display functionality #5158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces a Node class for deserializing and displaying serialization tree structures, along with a helper function to format large numbers with suffixes. The implementation enables hierarchical visualization of model structures stored in serialized dictionaries.
Changes:
- Added
Nodeclass with deserialization methods (deserialize,from_dict,from_list) to convert dictionary/list structures into tree nodes - Implemented display functionality via
__str__method that shows hierarchical structure with size information - Added
format_big_numberhelper function to format large numbers with K/M/B suffixes - Added test case to verify the Node display functionality
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| deepmd/dpmodel/utils/serialization.py | Implements Node class for tree deserialization and display, plus format_big_number utility |
| source/tests/common/dpmodel/test_network.py | Adds test for Node display functionality and updates test data with @Class field |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughAdds a big-number formatter and a new Node class to build, size, deserialize, and pretty-print serialization trees; updates tests to exercise Node deserialization/display and adds an Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…md-kit into serialization-display
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@deepmd/dpmodel/utils/serialization.py`:
- Around line 299-301: The loop handling data_dict currently deep-copies the
"@variables" payload (variables = deepcopy(vv)), which unnecessarily duplicates
large numpy arrays; change this to a shallow copy instead so we don't duplicate
underlying array buffers—e.g., assign a shallow copy appropriate to the
container type (for numpy arrays use vv.copy() or for mapping types use
dict(vv)) or simply vv if already an independent object; update the assignment
for the "@variables" branch (the variables variable assigned from
data_dict["@variables"]) to use a shallow copy strategy rather than deepcopy to
avoid doubling memory during deserialization.
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
deepmd/dpmodel/utils/serialization.py
🧰 Additional context used
🪛 Ruff (0.14.11)
deepmd/dpmodel/utils/serialization.py
271-271: Prefer TypeError exception for invalid type
(TRY004)
271-271: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (40)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Test Python (12, 3.10)
- GitHub Check: Test Python (10, 3.10)
- GitHub Check: Test Python (7, 3.13)
- GitHub Check: Test Python (11, 3.13)
- GitHub Check: Test Python (9, 3.10)
- GitHub Check: Test Python (12, 3.13)
- GitHub Check: Test Python (3, 3.10)
- GitHub Check: Test Python (6, 3.13)
- GitHub Check: Test Python (9, 3.13)
- GitHub Check: Test Python (8, 3.13)
- GitHub Check: Test Python (10, 3.13)
- GitHub Check: Test Python (7, 3.10)
- GitHub Check: Test Python (8, 3.10)
- GitHub Check: Test Python (11, 3.10)
- GitHub Check: Test Python (4, 3.13)
- GitHub Check: Test Python (2, 3.13)
- GitHub Check: Test Python (4, 3.10)
- GitHub Check: Test Python (2, 3.10)
- GitHub Check: Test Python (3, 3.13)
- GitHub Check: Test Python (1, 3.10)
- GitHub Check: Test Python (5, 3.10)
- GitHub Check: Test Python (5, 3.13)
- GitHub Check: Test Python (6, 3.10)
- GitHub Check: Test Python (1, 3.13)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Analyze (python)
- GitHub Check: Test C++ (false, false, false, true)
- GitHub Check: Test C++ (true, true, true, false)
- GitHub Check: Test C++ (true, false, false, true)
- GitHub Check: Test C++ (false, true, true, false)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
🔇 Additional comments (7)
deepmd/dpmodel/utils/serialization.py (7)
7-12: Import additions look consistent.
deepcopyandcached_propertyare both used later in the file.
181-201: Formatting helper is clear.The thresholds and suffixes are straightforward and readable.
214-225: Constructor is straightforward.The field initialization is clear and minimal.
226-250: Size aggregation logic is easy to follow.Counting variables plus child sizes is clean and readable.
252-271: Deserialize routing is clean.The dict/list branching is concise and readable.
314-337: List deserialization is clear.The handling of nested dict/list entries is straightforward.
339-364: Pretty‑print logic is readable.The indentation and merging logic are easy to follow.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #5158 +/- ##
=======================================
Coverage 81.92% 81.93%
=======================================
Files 714 714
Lines 73301 73397 +96
Branches 3616 3617 +1
=======================================
+ Hits 60055 60140 +85
- Misses 12083 12094 +11
Partials 1163 1163 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@deepmd/dpmodel/utils/serialization.py`:
- Around line 348-358: The current merge logic in the loop over self.children
(using children_buff, child_repr, last_repr) collapses distinct dict keys when
their stringified child subtrees match; restrict merging to only list-type
parents by checking the parent node type before merging (e.g., only perform the
"if child_repr == last_repr: ... merge ..." sequence when the current node
represents a list node), or add a guard that inspects the key type/format (the
kk variable) to ensure keys are positional indices before combining; update the
loop around children_buff, kk, vv and the merge branch so dict children are
never concatenated into "a, b" while preserving the existing merge behavior for
true list nodes.
- Around line 181-201: The format_big_number function can produce results like
"1000.0K" when rounding crosses a unit boundary; update format_big_number to
detect when the rounded quotient equals or exceeds 1000.0 and promote to the
next suffix (e.g., convert 1000.0K → 1.0M, 1000.0M → 1.0B). Concretely, compute
the quotient for each branch (q = x / 1_000, 1_000_000, 1_000_000_000),
round/format it to one decimal, and if q >= 1000.0, fall through to the next
unit branch so the final string uses the higher suffix; keep the same
one-decimal formatting and ensure the else branch returns str(x).
♻️ Duplicate comments (1)
deepmd/dpmodel/utils/serialization.py (1)
299-302: Avoid deep-copying large@variablespayloads.
deepcopy(vv)duplicates ndarray buffers and can be very expensive for large models. A shallow copy should be sufficient if Node is read-only or if@variablesalready owns its arrays.💡 Suggested change
- if kk == "@variables": - variables = deepcopy(vv) + if kk == "@variables": + # Shallow copy to avoid duplicating large ndarray payloads + variables = vv.copy()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@deepmd/dpmodel/utils/serialization.py`:
- Around line 348-378: The closure format_value currently captures the loop
variable ii from the outer for loop (line with self.children.items()), causing
wrong indentation; rename that outer ii to _ (unused) and change format_value to
accept ii as a parameter (e.g., def format_value(vv: str, ii: int) -> str) and
update the generator that builds buff to call format_value(vv, ii) using the ii
from enumerate(children_buff); keep format_list_keys and children_buff logic
unchanged (ListNode handling remains the same).
♻️ Duplicate comments (3)
deepmd/dpmodel/utils/serialization.py (3)
181-201: LGTM with minor caveat.The function correctly formats large numbers with appropriate suffixes. The boundary rounding issue (where values near thresholds could display as "1000.0K" instead of "1.0M") was already noted in a previous review.
214-250: LGTM!The constructor properly initializes all fields, and
sizeis appropriately cached. The unused return value incount_variables(line 241) was already noted in a previous review.
273-311: LGTM!The node construction logic correctly handles class naming, variable extraction, and recursive child building. The
deepcopyconcern for@variables(line 301) was already addressed in a previous review.
🧹 Nitpick comments (1)
deepmd/dpmodel/utils/serialization.py (1)
266-271: Consider usingTypeErrorfor invalid type.When the input is neither a
dictnor alist, this is a type mismatch rather than an invalid value. Per Python conventions,TypeErroris more appropriate here.♻️ Suggested fix
else: - raise ValueError("Cannot deserialize Node from non-dict/list data.") + raise TypeError(f"Cannot deserialize Node from {type(data).__name__}; expected dict or list.")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <[email protected]> Signed-off-by: Jinzhe Zeng <[email protected]>
Co-authored-by: Copilot <[email protected]> Signed-off-by: Jinzhe Zeng <[email protected]>
Signed-off-by: Jinzhe Zeng <[email protected]>
Summary by CodeRabbit
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.